Setup and Description of Dataset

This plot illustrates how well the algorithm separated individual bivalents per cell image. Several MSM male spreads had tight cells with a high degree of overlap, so fewer bivalents were individually segmented.

consider making a table with predictions for bivalent patterns which are related to the mechanism

Chrm Class proportions

Should I be thinking of making a MM or lm for predicting CO number (within a category)

but there is so little variance in the CO number

co ~ sc.length

I ran this model before, I remember that these models are better predictors (more significant for males than females)

maybe – female nuanced pattern – falls along the lines of

simple predictions not met as strongly as in male patterns…

this reasoning might be complicated by the fact that the range in CO averages is lower in females – so there is less power

The above plot displays the average proportion of each chromosome class across mice. The results mostly align with the hand-measured results.
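A minimal sketch of how per-mouse class proportions like these could be computed, assuming a bivalent-level table with `mouse` and `hand.foci.count` columns. The data frame `toy_biv` and its values are made up for illustration.

```r
# Toy bivalent-level table (made-up values, not real measurements)
toy_biv <- data.frame(
  mouse           = c("m1", "m1", "m1", "m1", "m2", "m2", "m2", "m2"),
  hand.foci.count = c(1, 1, 2, 1, 2, 2, 1, 0)
)

# Count bivalents per mouse and class, then convert each row to proportions
class_counts <- table(toy_biv$mouse, toy_biv$hand.foci.count)
class_props  <- prop.table(class_counts, margin = 1)

class_props
```

Averaging these per-mouse rows within a strain would give the strain-level proportions shown in a plot of this kind.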

Look into ordering the mouse strains to highlight the MSM and PWD differences, renaming the chrm type labels, spacing out the x axis labels or removing some of the other mice.

Chromosome Size effect

To deal with the chromosome size effect, which is known to affect total CO rate and CO position, we try to make a dataset of the longest 25% of SCs from each cell.

The majority of cells in the BivData set do not have the complete complement of bivalents (i.e. they are missing up to 5), which makes it hard to use an absolute length threshold, since some cells have shorter or longer bivalents overall than others. An absolute threshold might therefore catch the lower 25% of some cells.

I’d like to figure out how best to separate out the true 4th quartile bivalents (longest 25%), even with this missing data. Double check that the 4th quartile is the top 5/4.

Could I use the total SC? (I would be wary: I can’t ensure that they apply the same skeletonize algorithm/parameters, and the XY is also included in that count and will mess everything up.)

outline logic for filtering / choosing BivData set of almost whole cells

  • 16 to 20 bivalents in the cells
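The filtering rule above could be sketched as follows, assuming one row per segmented bivalent and a `fileName` column that identifies the cell image. The toy table and its values are hypothetical.

```r
# Toy table: one row per segmented bivalent; fileName identifies the cell image
toy_biv <- data.frame(
  fileName         = rep(c("cellA", "cellB", "cellC"), times = c(18, 12, 20)),
  chromosomeLength = seq(60, 109, length.out = 50)
)

# Number of bivalents segmented in each cell, repeated on every row of that cell
n_per_cell <- ave(seq_along(toy_biv$fileName), toy_biv$fileName, FUN = length)

# Keep only cells with 16 to 20 bivalents
near_whole <- toy_biv[n_per_cell >= 16 & n_per_cell <= 20, ]

nrow(near_whole)
```

Here the 12-bivalent cell is dropped and the other two are kept.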

All the plots above show the distributions of manually measured whole-cell SC lengths compared to the SC length distributions from the automated bivalent data. They show the amount of within-cell variance across strains. There is a bit of variance across the SC length distributions in the PWD females.

I’m proposing to take all the automated bivalent measures in each cell’s 4th quartile (above the cell’s 75th percentile) as a dataset of the ‘longest’ bivalents, which we would predict to exclude the shortest chromosomes, which have stronger chromosome size effects.
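A minimal sketch of a per-cell threshold, assuming `fileName` identifies the cell and `chromosomeLength` is the SC length. The toy values are made up, and the cutoff uses R's default `quantile()` interpolation, which is only one of several reasonable choices.

```r
# Toy table: two cells with different overall SC lengths (made-up values)
toy_biv <- data.frame(
  fileName         = rep(c("cellA", "cellB"), each = 8),
  chromosomeLength = c(50, 55, 60, 65, 70, 75, 80, 85,      # shorter cell
                       80, 88, 96, 104, 112, 120, 128, 136) # longer cell
)

# 75th percentile of SC length within each cell, repeated on every row
q75 <- ave(toy_biv$chromosomeLength, toy_biv$fileName,
           FUN = function(x) quantile(x, probs = 0.75, na.rm = TRUE))

# Bivalents above their own cell's 75th percentile form the 'long' set
toy_long <- toy_biv[toy_biv$chromosomeLength > q75, ]
toy_long
```

Note that a length of 80 is selected in the shorter cell but not in the longer one, which is the behaviour an absolute threshold cannot give.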

But this dataset might be noisy, given the amount of variance in the SC length distributions across cells (PWD females, WSB females). But it’s easier to just run another set of analyses.

Biv_Data_mean_table has the Q4 values

Well, fixing those bugs took forever. Use the DF real.long.bivData for comparing the longest bivalents. It is still very unbalanced across sexes and strains ~ merge with the manual whole cell biv measures later…

MSM male only has data from 1 cell, and PWD 3, uh oh – may need to merge sooner.

First proof of principle will be comparisons within category (long vs all)

In the code chunk above I ran the mouse averages for the longest bivalents (680 bivalents from 54 mice; the full dataset is 10202 bivalents from 86 mice).

These plots show the SC lengths for bivalents I’m calling the ‘long SC data set’. They are supposed to be the longest 4-5 SCs from cells where I could get good measures. These longer bivalents are useful because their patterns shouldn’t be affected by the chromosome size effect (which affects CO position). Hopefully this data set will have less noise from chromosome identity, but there is still missing data (they don’t come from whole cell measures).

Adjusting for XX

The female mouse averages should be adjusted for the XX. I’m working on code to estimate the XX SC length from the 3rd largest bivalent in female whole cell data across strains, and then subtract this amount from the female mouse averages… This isn’t the best solution, since I can’t determine what proportion of cells in the female mouse averages include the XX (most cells are missing at least 3 bivalents).

  • Of all female single bivalents 5% are XXs (1 of 20).

  • The XX is large, thought to be within the top 25% longest bivalents of the cell (3rd largest by Mb).

  • What is the average % of XX for whole cell SC (sum(all bivalents))? Let’s guess 12% of a cell’s total SC area is XX.

  • do the algebra…
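One way to do the algebra, as a rough sketch: assume (hypothetically) that a cell contributes all 20 bivalents and that the XX makes up a fraction \(f\) of the cell's total SC (the 12% guess above gives \(f = 0.12\)). With \(\bar{L}\) the observed mean bivalent SC length, the XX length and the autosome-only mean would be

\[L_{XX} ~=~ f \cdot 20 \bar{L} ~=~ 2.4 \, \bar{L} \]

\[\bar{L}_{auto} ~=~ \frac{(1 - f) \cdot 20 \bar{L}}{19} ~=~ \frac{17.6}{19} \bar{L} ~\approx~ 0.926 \, \bar{L} \]

Under these assumptions the female averages would shrink by about 7%. Because most cells are missing bivalents and it is unknown which cells include the XX, this is only a rough guide to the size of the correction.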

Applying Mixed Model framework

I will use two different mixed models using lme() and lmer(), because the different functions have limitations for reporting p values etc.

List out pros and cons of using mouse averages of bivalent traits instead of the single bivalent traits themselves.

Pro Bivalent level

  • The data are continuous and closer to normal/gaussian expectations than the MLH1 counts.

  • There is within cell variance that is an issue.

Pro Mouse level averages

  • Mouse average is a conserved framework, it would summarize general patterns which fits with the paper.

  • Simple application of the same MM.

  • Limited work to add to the dataset if I can justify that the collection of single bivalents is random and equivalent across mice.

In the chunk above the mouse averages table is made – may need to add all the extra metrics (IFD, .
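A minimal sketch of building a mouse averages table with base R `aggregate()`, assuming bivalent-level columns named as in the rest of this notebook (`chromosomeLength`, `IFD1_ABS`). The toy rows are made up, and `toy_mouse_avs` is a hypothetical name.

```r
# Toy bivalent-level table (made-up values); IFD is NA for 1CO bivalents
toy_biv <- data.frame(
  mouse  = c("m1", "m1", "m1", "m2", "m2", "m3"),
  strain = c("WSB", "WSB", "WSB", "WSB", "WSB", "PWD"),
  sex    = c("male", "male", "male", "male", "male", "female"),
  chromosomeLength = c(80, 90, 100, 70, 90, 120),
  IFD1_ABS         = c(NA, 40, 50, NA, 45, 60)
)

# One row per mouse; na.pass keeps rows with NA so each column is averaged
# over its own non-missing values
toy_mouse_avs <- aggregate(
  cbind(chromosomeLength, IFD1_ABS) ~ mouse + strain + sex,
  data = toy_biv, FUN = mean, na.rm = TRUE, na.action = na.pass
)

toy_mouse_avs
```

Extra metrics can be added by listing more columns inside `cbind()`.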

Predictions, Heterochiasmy

Using the Mixed model framework which tests the effects (and interactions) of subspecies, sex and strain, I will test for evolution of the following traits.

Two bivalent level traits are predicted to display heterochiasmy (i.e. significant effects of sex):

  1. SC length (cite Lynn)

2.A) normalized CO positions (Sardell and Kirkpatrick).

2.B) sister cohesion tension (sis-co-ten) will be sexually dimorphic, as it reflects the general property of uniform vs telomere-biased CO positioning.

  3. Interference, measured in physical units, has been shown to lack sexual dimorphism (Petkov).

These 3 hypotheses will be tested with the mixed model framework, specifically focusing on the effect of sex.

Mouse averages for each metric will be used.

Predictions, male polymorphism

What predictions do I have for how traits will differ between the low and high Musc mouse strains?

  1. SC lengths will be longer for high Rec

B.1) Interference/IFD will be shorter in high Rec

or

B.2) Interference/IFD will be longer in high Rec males, sis-co-ten metric will be maximized. (given prelim results from Beth’s data)

  1. It’s unknown whether 1CO normalized positions will differ between high and low Musc strains. (no good predictions come to mind)

  2. Sis-co-ten predictions fall into

Using the same MM framework we would expect the random strain effect would be significant for these predictions.

Use just the male (Musc male data)

\[mouse \ average \ SC \ length ~=~ rand(strain) + \varepsilon \]

\[mouse \ average \ IFD \ trait ~=~ rand(strain) + \varepsilon \]

The IFD traits are:

  • IFD_ABS

  • IFD_PER

\[mouse \ average \ CO \ position \ trait ~=~ rand(strain) + \varepsilon \]

The CO position traits are:

  • 1CO position

  • centromere and telomere distance

  • sis-co-ten

SC Lengths

First draft of the SC length plot. I’d like to add the hand foci count to the x axis. (also add the PWD - male data)

TRY ADDING NESTED boxplots for hand.foci.count

\[mouse \ average \ SC \ length ~=~ subsp * sex + rand(strain*sex) + \varepsilon \]

To estimate random effects, I’m using lmer (not lme where I could do the rand(strain*sex)). So now it’s just random(strain).

Above is the model we use to test for evolution of SC lengths. Remember, for the MM use the table mouse.avs_4MM
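A minimal sketch of this kind of model on simulated data, using `lme()` from the nlme package with fixed `subsp * sex` effects and a random strain intercept. The data frame `toy_avs`, the strain labels, and all effect sizes are made up; this is not the fit reported in this notebook.

```r
library(nlme)  # recommended package shipped with R

set.seed(1)
# Simulated mouse averages: 8 strains (4 per subspecies), both sexes,
# 5 mice per strain-by-sex group
toy_avs <- expand.grid(rep = 1:5, sex = c("female", "male"),
                       strain = paste0("s", 1:8))
toy_avs$subsp <- factor(ifelse(as.integer(toy_avs$strain) <= 4, "Dom", "Musc"))

strain_shift <- rnorm(8, sd = 3)                    # random strain effect
toy_avs$mean_SC <- 90 +
  ifelse(toy_avs$sex == "female", 15, 0) +          # sex effect
  strain_shift[as.integer(toy_avs$strain)] +
  rnorm(nrow(toy_avs), sd = 4)                      # residual noise

# Fixed effects: subsp * sex; random intercept for strain
fit <- lme(mean_SC ~ subsp * sex, random = ~ 1 | strain, data = toy_avs)

anova(fit)    # F-tests for the fixed effects
VarCorr(fit)  # strain and residual variance components
```

Testing the random strain effect itself needs a separate step, such as the exactRLRT() test used in this notebook.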

Remember the long biv data set is in real.long.bivData

I used the mixed model with the lme() for the full Curated BivData and long bivdata

\[mouse \ average \ SC \ length ~=~ subsp * sex + rand(strain) + \varepsilon \]

## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 6.6745, p-value = 0.0028
## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 0.075954, p-value = 0.3162

Heterochiasmy Prediction

Is sex a significant effect for SC length? (as predicted)

  • The results indicate that sex is a significant effect for SC length. Consider writing a subsampling approach (randomize / permute a data set of BivData).

  • According to anova, sex effect explains most of the variance in single bivalent SC lengths.

  • The Long Biv Data set largely agrees with the full curated dataset

  • A caveat I haven’t addressed yet is the XX in the female Biv data averages.

Is the random effect of strain significant for SC length?

The exactRLRT() tests give mixed evidence for the random strain effect: the first test is significant (RLRT = 6.6745, p = 0.0028), while the second is not (RLRT = 0.075954, p = 0.3162).

Male polymorphism Prediction

Prediction: high rec males have longer SC.

I can’t get the above mixed models to work for testing high vs low rec groups

## 
## Call:
## glm(formula = Rec.group ~ mean_SC, family = binomial(link = "logit"), 
##     data = mouse.avs_4MM_male[mouse.avs_4MM_male$subsp == "Musc", 
##         ])
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.41174  -0.39326  -0.15951   0.09663   2.29849  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -72.2866    31.3071  -2.309   0.0209 *
## mean_SC       0.8700     0.3802   2.288   0.0221 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 30.789  on 22  degrees of freedom
## Residual deviance: 12.171  on 21  degrees of freedom
## AIC: 16.171
## 
## Number of Fisher Scoring iterations: 7
# Male mice from the long-bivalent mouse averages
long.SC.mouse.avs_4MM_male <- Long_biv_mouse.avs_4MM[Long_biv_mouse.avs_4MM$sex == "male",]
# Musc-only subset (created before Rec.group is added; not used in the model below)
long.SC.mouse.avs_4MM_Mmale <- long.SC.mouse.avs_4MM_male[long.SC.mouse.avs_4MM_male$subsp == "Musc",]

# Divide males into recombination groups: PWD and MSM are high (1), all others low (0)
long.SC.mouse.avs_4MM_male$Rec.group <- ifelse(grepl("PWD|MSM", long.SC.mouse.avs_4MM_male$strain), 1, 0)

# Logistic regression: does mean SC length predict the recombination group?
long.biv.log.reg <- glm(Rec.group ~ mean_SC, 
                            data=long.SC.mouse.avs_4MM_male, family=binomial(link="logit"))

summary(long.biv.log.reg) # NS effect (likely underpowered)
## 
## Call:
## glm(formula = Rec.group ~ mean_SC, family = binomial(link = "logit"), 
##     data = long.SC.mouse.avs_4MM_male)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.8824  -0.5977  -0.4916  -0.3294   2.1515  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -12.53251    9.13091  -1.373    0.170
## mean_SC       0.10955    0.09031   1.213    0.225
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 16.574  on 18  degrees of freedom
## Residual deviance: 14.941  on 17  degrees of freedom
## AIC: 18.941
## 
## Number of Fisher Scoring iterations: 5

When all male mice are used, the predictive power is greater than when just the Musc strains are used. When just the Musc strains are used, the mouse mean SC is slightly significant in predicting whether a mouse is in the high or low Rec group. (Should I consider running this on females too?)

Is the prediction, high rec musc male strains have long SC met?

In a logistic regression, mouse average SC length is slightly predictive of whether a mouse is in a high or low Rec strain. I couldn’t get the mixed models working for the male polymorphism predictions…
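The logistic regression approach can be sketched on simulated data as follows. The helper `rec_group()` is a hypothetical name, the strain labels are only used to mark groups, and the SC values are made up.

```r
# Label PWD and MSM as high-recombination (1), everything else as low (0)
rec_group <- function(strain) as.integer(grepl("PWD|MSM", strain))

set.seed(2)
# Simulated mouse averages (made-up values): 6 mice per strain
toy_males <- data.frame(
  strain  = rep(c("PWD", "MSM", "SKIVE", "KAZ"), each = 6),
  mean_SC = c(rnorm(12, mean = 88, sd = 5), rnorm(12, mean = 82, sd = 5))
)
toy_males$Rec.group <- rec_group(toy_males$strain)

# Does mean SC length predict membership in the high Rec group?
fit <- glm(Rec.group ~ mean_SC, data = toy_males,
           family = binomial(link = "logit"))

# Odds ratio for the high Rec group per unit increase in mean SC length
exp(coef(fit)["mean_SC"])
```

With only a few mice per strain the standard errors will be wide, which matches the underpowered fits above.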

IFD

ToDo: fix the function to calculate IFD2, IFD3

Interfocal Distance, an indication of CO interference.
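A minimal sketch of an interfocal distance function that handles any number of foci, so the second and third distances come out of the same call. `calc_ifd()` is a hypothetical helper, not the function used in this notebook, and it assumes foci positions and SC length are in the same units.

```r
# Interfocal distances for one bivalent.
# foci_pos:  positions of the foci along the SC (NA entries are ignored)
# sc_length: total SC length, in the same units as foci_pos
# Returns absolute distances (abs) and distances as a fraction of SC length (per).
calc_ifd <- function(foci_pos, sc_length) {
  pos <- sort(foci_pos[!is.na(foci_pos)])
  if (length(pos) < 2) {
    # Fewer than two foci: no interfocal distance is defined
    return(list(abs = numeric(0), per = numeric(0)))
  }
  ifd_abs <- diff(pos)  # distance between each pair of neighbouring foci
  list(abs = ifd_abs, per = ifd_abs / sc_length)
}

# A 3CO bivalent of length 120 with foci at 20, 70 and 110
calc_ifd(c(20, 70, 110), sc_length = 120)
```

The first element of `abs` and `per` corresponds to the first IFD, the second element to the next one, and so on.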

How can I highlight the result (I need to verify this first) for greater IFD in high rec male mice?

The raw measures are pretty noisy; this might not be the best way to display these results.

#long SC Long_biv_mouse.avs_4MM
#real.long.bivData


#Only use 2CO bivalents (rows with a missing foci count are dropped)
real.long.bivData.22 <- real.long.bivData[!is.na(real.long.bivData$hand.foci.count) &
                                            real.long.bivData$hand.foci.count == 2,]


long.biv_IFD_look_ABS <- ggplot(real.long.bivData.22, aes(y = IFD1_ABS, x= mouse, color=strain))+geom_boxplot(alpha=.2)+
  geom_jitter(alpha=.2)+
  ggtitle("Interfocal Distances, ABS")+
  scale_color_manual(values=colors_of_strains)+  facet_wrap(sex~subsp, scales = "free")+ylim(c(20,160))+theme(legend.position="none")

long.biv_IFD_look_ABS
## Warning: Removed 6 rows containing non-finite values (stat_boxplot).
## Warning: Removed 6 rows containing missing values (geom_point).

long.biv_IFD_look_PER <- ggplot(real.long.bivData.22, aes(y = IFD1_PER, x= mouse, color=strain))+geom_boxplot(alpha=.2)+
  geom_jitter(alpha=.2)+
  ggtitle("Interfocal Distances, PER")+
  scale_color_manual(values=colors_of_strains)+  facet_wrap(sex~subsp, scales = "free")+ylim(c(0,1))+theme(legend.position="none")

long.biv_IFD_look_PER

Above are the first plots of IFDs. Each point is an IFD observation. Colors are by strain. Maybe add mouse level lines for the means.

Maybe I should try binning the SC lengths.

Mixed model analysis for IFD (interference). The first set of models is made with the lme() function.

\[mouse \ average \ IFD ~=~ subsp * sex + rand(strain*sex) + \varepsilon \]

second set of Mixed Models

This second set of models is made using lmer().

\[mouse \ average \ IFD ~=~ subsp * sex + rand(strain) + \varepsilon \]

\[mouse \ average \ normalized \ IFD ~=~ subsp * sex + rand(strain) + \varepsilon \]

I tested 2 versions of the mixed model for this flavor of trait: the raw IFD and the normalized IFD measure. The tables below are from anova() for the lmer model.

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 0, p-value = 1
## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 1.9997, p-value = 0.0569
#make t.tests for pooled high and low

Heterochiasmy Prediction

Do the IFD measures lack significant sex effect? (as predicted)

There are sex effects in the lme() models for both the ABS and PER IFD measures. There is slight evidence for a strain effect.

Are there significant strain effects?

The random strain effect was tested for raw and normalized IFD; neither is significant at the 0.05 level (p = 1 and p = 0.0569).

Male Polymorphism Prediction

Prediction A: is the male polymorphism prediction met? Do high rec strains have shorter IFDs?

Neither t-test is significant, for either ABS or PER, when I test just the Musc strains. The t-tests above are breaking knitr.
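One possible reason a t-test stops a knit is a group with fewer than two observations, which makes `t.test()` throw an error. That is only a guess about the failure here. A minimal sketch of a wrapper that returns `NA` instead of stopping, with `safe_t_p()` as a hypothetical helper name:

```r
# Welch two-sample t-test p value; returns NA instead of stopping the knit
# when the test cannot be computed (e.g. a group with a single observation)
safe_t_p <- function(x, y) {
  tryCatch(t.test(x, y)$p.value, error = function(e) NA_real_)
}

# Toy mouse-average IFDs for pooled high and low Rec groups (made-up values)
high_rec <- c(62, 58, 65, 60, 63)
low_rec  <- c(57, 59, 55, 61, 56)

safe_t_p(high_rec, low_rec)      # ordinary case, returns a p value
safe_t_p(high_rec[1], low_rec)   # too few observations, returns NA
```

Setting `error = TRUE` in the chunk options is another way to keep knitr going while the underlying problem is tracked down.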

None of the logistic regression models for ABS or PER IFD lengths are significant, even when just the Musc strains are used.

Post-hoc comparisons ideas?

Preliminary results from an independent dataset indicated that PWD had longer IFDs, which goes against the simple prediction that more COs mean denser spacing of foci on the same bivalent. This also indicated that interference distance may have evolved in the house mouse complex.

Prediction B: is there evidence for the alternative IFD measure?

General CO Positions

There are a few traits that fall within the CO positions

  • 1CO normalized position

  • sis.co.ten (sorta interference)

  • centromere and telomere distance

for the long chrm, pull the object ids out of the melt table.

Normalized 1CO positions

Above plot focuses on the 1CO bivalent normalized positions – since CO interference controls the general position of COs when there are multiple COs. This plot shows the sexual dimorphism in the density plots.

consider adding annotate_text for the number of observations in each category. think about adding a vertical line for centromere, for the position means. Think about removing the extra Musc strains.

Tried running plots for the long biv data set; I think there are too few observations.

add boxplots for

These boxplots show that females have a much more medial position of single-focus bivalents (much closer to 50% compared to males). They also show that Musc males’ Foci1 position is slightly more central / medial compared to the same positions in the Dom male strains. MOLF males have much more medial positions than other strains.

Mixed models norm 1CO positions

The distributions of SC lengths and sis-co-ten seem very different across sexes.

\[mouse \ average \ F1 \ position ~=~ subsp * sex + rand(strain) + \varepsilon \]

The mixed model data should only come from 1CO bivalent data.

## Linear mixed model fit by REML ['lmerMod']
## Formula: norm.first.foci.pos ~ subsp * sex + (1 | strain)
##    Data: mouse.avs_4MM
## 
## REML criterion at convergence: -288.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.6500 -0.6255 -0.0452  0.5151  3.2887 
## 
## Random effects:
##  Groups   Name        Variance  Std.Dev.
##  strain   (Intercept) 0.0009902 0.03147 
##  Residual             0.0010440 0.03231 
## Number of obs: 82, groups:  strain, 8
## 
## Fixed effects:
##                    Estimate Std. Error t value
## (Intercept)        0.559971   0.019834  28.232
## subspMusc         -0.005967   0.025507  -0.234
## sexmale            0.135770   0.010934  12.417
## subspMusc:sexmale -0.029024   0.015016  -1.933
## 
## Correlation of Fixed Effects:
##             (Intr) sbspMs sexmal
## subspMusc   -0.778              
## sexmale     -0.290  0.225       
## sbspMsc:sxm  0.211 -0.316 -0.728
## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 14.982, p-value < 2.2e-16

Siscoten

The metric sis-co-ten measures the amount of sister cohesion connected to the other pole. Alternating IFDs. Assuming chromatid interference and no sister separation, this should reflect the amount of sister cohesion that is pulled across the opposite pole.

The logic of the sis-co-ten metric is outlined in the figure below. The authors of this paper call it something else. Looking at this sis-co-ten metric is a way to model different cohesin outcomes as a consequence of different numbers and placements of chiasmata/COs. This metric is calculated from SC area, which we are using as a proxy for the amount of cohesion at metaphase.

Logic of Sis-co-ten metric, from (Lee, J. (2019). Is age-related increase of chromosome segregation errors in mammalian oocytes caused by cohesin deterioration?. Reproductive Medicine and Biology.)

Males have much clearer separation of sis-co-ten across chrm classes. This is emphasized when SC length is also plotted.

Telomere and centromere Distance

My metrics for telomere and centromere distance measure the distance of the nearest focus to each end of the bivalent (SC). In the plots below each point is a single bivalent. I chose not to use the centromere mark because it seems noisy and inconsistent…
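A minimal sketch of these two end distances, assuming (for illustration only) that foci positions are measured from the centromere end, so position 0 is the centromere end and the SC length is the telomere end. `end_distances()` is a hypothetical helper name.

```r
# Distance from the nearest focus to each end of the SC.
# Assumption: positions run from the centromere end (0) to the telomere end (sc_length)
end_distances <- function(foci_pos, sc_length) {
  pos <- foci_pos[!is.na(foci_pos)]
  if (length(pos) == 0) {
    # No foci: both distances are undefined
    return(c(cent_dist = NA_real_, telo_dist = NA_real_))
  }
  c(cent_dist = min(pos),              # focus closest to the centromere end
    telo_dist = sc_length - max(pos))  # focus closest to the telomere end
}

end_distances(c(30, 95), sc_length = 100)  # 2CO bivalent
end_distances(60, sc_length = 100)         # 1CO bivalent
```

Dividing both values by `sc_length` would give the normalized versions.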

Males on average have much lower raw telomere distance (reflects the telomere bias) compared to females. In Males, 2CO bivalents have very low telomere distances, while the 1CO bivalents have a greater range. In females the ranges of telomere distances have much more overlap.

The normalized centromere plots show that in Musc males, on 2CO bivalents the 1st CO is closer to the centromere end than in Dom males.

Females have more overlap in the distributions of centromere distances across chromosome class compared to males.

Heterochiasmy Prediction

Is sex a significant effect for the 1CO normalized CO position? (as predicted)

Is the random strain effect significant?

The random strain effect seems very significant.

remember to use the mouse average table mouse.avs_4MM (I don’t think I need the MELT data frame)

#make point plots / boxplots which show differences in mean positions 

#scatter + boxplot for t.tests

Dr. Broman suggested that the Kolmogorov-Smirnov test (curve comparison) wasn’t the best test for differences in general CO position. He suggested doing simple t-tests for the positions.

#try remaking the plot Megan suggested
# for 2CO positions, Foci1, Position  on x and Foci 2 position on y

CurBivData_2CO <- Curated_BivData[Curated_BivData$hand.foci.count == 2,]

CurBivData_2CO <- CurBivData_2CO[!(is.na(CurBivData_2CO$Foci2) | CurBivData_2CO$Foci2==""), ]

#isolate 2COs
#facet by sex and subsp

F1.x.F2 <- ggplot(CurBivData_2CO, aes(x=Foci1,y=Foci2, color=strain) ) + geom_point()+ facet_wrap(~sex)+ggtitle("test plot")
F1.x.F2

#what is the pattern of variance
#run analyses for each subsp*sex
#use non-melt DF

#how is the variance partioned across
#cell, mouse, strain

female.Dom <- Curated_BivData[Curated_BivData$sex == "female",]
female.Dom <- female.Dom[female.Dom$subsp == "Dom",]

female.Dom$Foc1.PER <- female.Dom$Foci1 / female.Dom$chromosomeLength

#unorder strain and mouse

female.Dom$mouse <- as.factor(female.Dom$mouse)


female.Dom$strain <- unclass(female.Dom$strain)
female.Dom$strain <- as.factor(female.Dom$strain)

female.Dom_1CO <- female.Dom[female.Dom$hand.foci.count == 1,]
female.Dom_1CO <- female.Dom_1CO[(!is.na(female.Dom_1CO$hand.foci.count)),]

#1CO first
modo <- lm(Foc1.PER ~ fileName + mouse + strain, data=female.Dom_1CO)

#can't get mouse and strain to have sum of square
#residual size decreases with per.F1
#residuals much larger than fileName, mouse and strain no 

#model <- lm(breaks ~ wool * tension, 
#            data = warpbreaks, 
#            contrasts = list(wool = "contr.sum", tension = "contr.poly"))

male.Dom <- Curated_BivData[Curated_BivData$sex == "male",]
male.Dom <- male.Dom[male.Dom$subsp == "Dom",]

male.Dom$mouse <- as.factor(male.Dom$mouse)

male.Dom$strain <- unclass(male.Dom$strain)
male.Dom$strain <- as.factor(male.Dom$strain)

male.Dom <- male.Dom[male.Dom$hand.foci.count == 1,]
male.Dom <- male.Dom[(!is.na(male.Dom$hand.foci.count)),]

male.Dom$Foc1.PER <- male.Dom$Foci1 / male.Dom$chromosomeLength



# '|' is not a nesting operator in lm(); nested terms are written with '/'
male.modo <- lm(Foc1.PER ~ strain/mouse/fileName, data=male.Dom)
summary(aov(male.modo))

#previously only file name was registering as an effect
#Review ANOVA frameworks
#http://www.biostathandbook.com/nestedanova.html

Male Polymorphism Prediction

I didn’t have a good prediction for the male polymorphism… So I’ll just do post-hoc comparisons for the groups to test if they are different.

#CO_pos

old drafts

I went on a kick thinking about the sis-co-ten metric; I’m not so sure this is a good metric anymore. Do paired t-tests for all the male strains to test if foci position distributions are different:

1CO(F1) vs 1CO(F1)

2CO(F1) vs 2CO(F1)

2CO(F2) vs 2CO(F2)

Predictions for sis-coten stabilizing selection

Because it enables tighter regulation of the spindle assembly checkpoint (SAC) the centrosomic metaphase spindle in male gametogenesis can impose selection on relative CO placement (broad scale REC landscape). (OR generally due to the stronger sensitivity of the SAC in sperm/males). Following certain assumptions, the axis length and relative positions of COs can change the amount sister-cohesin over which tension is generated on the metaphase I spindle. This can be thought of as the bivalent structure when they are in the process of bipolar orientation.

Under strong purifying selection (the idea is that bivalent structures evolve to efficiently pass the SAC; there might be multiple ways to do this):

  • similar amount of sister cohesion tension, related to the rate of sister cohesion degradation, so all bivalents separate at a similar rate (enter anaphase and complete anaphase in a synchronized manner).

  • tension for SAC detection

  1. In MSM and PWD males (the high recombining strains), 1COs should have less of a telomere bias compared to the other low rec mice (IFD).

The MSM and PWD males have approximately 50% 1CO and 50% 2CO bivalents, compared to 75% 1CO and 25% 2CO in the lower recombining strains (Dom and Musc).

If the metaphase spindle / SAC evolved a new optimal amount of sister cohesion for the spindle tension to be spread over, the parameters of relative co placement and CO interference can evolve so that they are similar across .

In terms of percent position, PWD, LEW and MSM have similar mean Foci1 pos. LEW is sort of the odd one out. Interestingly, LEW seems to have stronger centromere suppression compared to PWD and MSM (the distribution of foci positions is larger in MSM, PWD and KAZ).

  1. Sister cohesin tension area for 1COs should trend more similar to 2COs in high recombining strains.

What’s the rationale for this prediction?

Mouse is shown on the x axis. It seems like PWD and MSM might have a higher variance in the range of sis-co-ten values for 2COs, but overall the average sis-co-tens for 2COs are very similar across all of the mice.

3. Lack of strain specific patterns in females

the vls are much closer in females

SC length traits

Females have higher variance in more meiotic traits

The above plots separate bivalent classes. (Make a table of p values for t-tests of male to female SC lengths across strains.)

##                   Df  Sum Sq Mean Sq F value Pr(>F)    
## hand.foci.count    1  929673  929673 1826.68 <2e-16 ***
## sex                1 1225522 1225522 2407.99 <2e-16 ***
## strain             7  147348   21050   41.36 <2e-16 ***
## Residuals       9553 4861911     509                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

By ANOVA analysis, everything has a significant effect.

PERMUTATIONS Accounting for sample size differences

Currently the category with the fewest observations is G female (~30). I also have many more Kaz male bivalent measures (1000)

Var1 Freq
WSB female 765
WSB male 592
G female 725
G male 1004
LEW female 714
LEW male 533
PERC male 0
PWD female 1030
PWD male 678
MSM female 549
MSM male 173
MOLF female 0
MOLF male 437
SKIVE female 488
SKIVE male 775
KAZ female 485
KAZ male 615
CZECH female 0
CZECH male 0
AST male 0
TOM male 0
CAST female 0
CAST male 0
HMI female 0
HMI male 0
SPRET female 0
SPRET male 0
SPIC female 0
SPIC male 0
CAROLI female 0
CAROLI male 0
F1 0
other 0

Below is code where I implement a subsampling approach/permute dataset with randomly sampled bivalent observations. 25 random bivalents are sampled from each category, and this is permuted 1000 times.
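The subsampling idea described above can be sketched as follows, using simulated data with unequal group sizes. The category labels, sample sizes and SC lengths are made up; only the 25 draws per category and the 1000 repeats follow the text.

```r
set.seed(4)
# Toy bivalent-level data with unequal sample sizes per category (made-up values)
toy_biv <- data.frame(
  category = rep(c("A female", "A male"), times = c(700, 170)),
  chromosomeLength = c(rnorm(700, mean = 100, sd = 20),
                       rnorm(170, mean = 85,  sd = 20))
)

n_draw <- 25    # bivalents sampled per category
n_rep  <- 1000  # number of subsampled datasets

p_vals <- replicate(n_rep, {
  # Draw n_draw bivalents from each category without replacement
  sub <- lapply(split(toy_biv$chromosomeLength, toy_biv$category),
                sample, size = n_draw)
  # Welch t-test between the two equal-sized subsamples
  t.test(sub[[1]], sub[[2]])$p.value
})

mean(p_vals < 0.05)  # fraction of subsamples with a significant difference
```

Calling `hist(p_vals)` on the result gives a p value histogram of the kind described here.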

The plots shown below are histograms of p values from t.tests for SC length across the labeled categories from the replicated/subsampled dataset.

Modeling SC lengths at the bivalent level is noisy. There are many levels and unequal sampling.

maybe I should compare the longest bivalent from a cell? or the shortest 4?

Centromere suppression

Anecdotally I would say that males have stronger centromere suppression. I will also add a metric for telomere distance.

Interesting that the distance from the centromere is sexually dimorphic for 2CO bivalents but not 1CO bivalents. Females have stronger centromere suppression, strange. Did I calculate this wrong? This indicates that females have stronger centromere suppression, but only for 2CO bivalents…

Verbalizing the inference from all the p values.

Using the same permuted / subsampled data frame – I’ll look at centromere and telomere distance

How would these distributions change normalized distances?

t.tests for centromere dis and telomere distance This result and the IFD could be the same mechanism.

Higher recombining males have stronger interference

Do the same figure things for IFD

There might be too few 2CO observations for many dom chrm to get the IFDs

Var1 Freq
WSB female 187
WSB male 107
G female 242
G male 183
LEW female 245
LEW male 106
PERC male 0
PWD female 285
PWD male 301
MSM female 159
MSM male 91
MOLF female 0
MOLF male 97
SKIVE female 113
SKIVE male 253
KAZ female 128
KAZ male 84
CZECH female 0
CZECH male 0
AST male 0
TOM male 0
CAST female 0
CAST male 0
HMI female 0
HMI male 0
SPRET female 0
SPRET male 0
SPIC female 0
SPIC male 0
CAROLI female 0
CAROLI male 0
F1 0
other 0

The

more variation across traits for females than males

High recombining male mice have more total SC?

I think with total SC, this is true, but it doesn’t seem true for the single bivalent data. This could also be due to sampling issues.

Interfocal comparisons

The above plots show the raw and normalized distributions of the first interfocal distance.

Sis Co ten Metric